Using RStudio on Hoffman2

Hoffman2 Happy Hour

Charles Peterson

🎉 Welcome to Hoffman2 Happy Hours

  • Short presentations on HPC-related topics and practical uses of Hoffman2
  • Thoughts for future “Happy Hour” topics? 💡

📧 cpeterson@oarc.ucla.edu

📖 Access the Workshop Files

This presentation and its accompanying materials are available in the UCLA OARC GitHub repository 🔗 https://github.com/ucla-oarc-hpc

You can view the slides:

Clone the repository for workshop files:

git clone https://github.com/ucla-oarc-hpc/H2HH_rstudio

RStudio Information

🖥️ What is RStudio?

A powerful IDE for R, data visualization, and script management


But why use RStudio on Hoffman2 when you can use your own computer?

RStudio on Hoffman2 provides access to:

  • More memory
  • Multi-core processing
  • GPUs
  • Your Hoffman2-hosted data

🌐 RStudio Formats

There are two main (free) RStudio formats that researchers can use


🖥️ RStudio Desktop

  • Standalone desktop application

  • Installed locally on your machine

🌐 RStudio Server

  • Run RStudio as a server process on Hoffman2
  • Open in a web browser

🚀 RStudio on Hoffman2

RStudio Desktop can be inefficient on Hoffman2

  • Requires X11 forwarding
  • Slow
  • Sluggish interaction

RStudio Server is the best way to use RStudio on Hoffman2

Running RStudio

Running RStudio (1)

Get An Interactive Job

Containers cannot run on login nodes.

  • You MUST use a compute node


qrsh -l h_data=10G

Modify the qrsh command to meet your RStudio computing needs

  • More memory and/or job time
qrsh -l h_data=50G,h_rt=5:00:00
  • More cores
qrsh -l h_data=10G -pe shared 10
  • Using GPUs
qrsh -l h_data=10G,gpu,V100
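
The resource options above can be combined in a single request. The sketch below is only an illustration: build_qrsh is a hypothetical helper (not part of Hoffman2) that prints a qrsh command line from a memory, run time, and core count, without submitting anything. It assumes h_data is requested per core, so check the Hoffman2 documentation when sizing a multi-core job.

```shell
# Hypothetical helper: prints a qrsh command line; it does not submit anything.
# Arguments: memory (e.g. 10G), run time (H:MM:SS), optional number of cores.
build_qrsh() {
    local mem="$1" runtime="$2" cores="${3:-1}"
    local cmd="qrsh -l h_data=${mem},h_rt=${runtime}"
    # Only request the shared parallel environment when more than one core is needed.
    if [ "$cores" -gt 1 ]; then
        cmd="${cmd} -pe shared ${cores}"
    fi
    printf '%s\n' "$cmd"
}

# Example: 10G of memory, 2 hours, 4 cores.
build_qrsh 10G 2:00:00 4
```

Copy the printed line into your Hoffman2 terminal once it matches what you need.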

Running RStudio (2)

Create Temp Directories

  • Create writable temp directories
    • RStudio writes small files
    • Anywhere you have write access



mkdir -pv $SCRATCH/rstudiotmp/var/lib
mkdir -pv $SCRATCH/rstudiotmp/var/run
mkdir -pv $SCRATCH/rstudiotmp/tmp
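
Since the directories can live anywhere you have write access, the three mkdir commands can be wrapped so the base location is a parameter. This is a minimal sketch; make_rstudio_tmp is a hypothetical name, and the rstudiotmp layout simply mirrors the commands above.

```shell
# Hypothetical helper: create the three RStudio temp directories under a base path.
make_rstudio_tmp() {
    local base="${1:?base directory required}"
    local d
    for d in var/lib var/run tmp; do
        mkdir -p "${base}/rstudiotmp/${d}" || return 1
    done
}

# Example call on Hoffman2 (left commented so the sketch has no side effects):
# make_rstudio_tmp "$SCRATCH"
```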

Running RStudio (3)

Load the Apptainer Module

  • Apptainer is the software that runs the RStudio container


module load apptainer

RStudio Server on Hoffman2 is built from a Docker image

  • The container's R is separate from the R modules on Hoffman2
    • DO NOT load R modules
    • R packages may need to be reinstalled

Running RStudio (4)

Start Up RStudio

apptainer run \
 -B $SCRATCH/rstudiotmp/var/lib:/var/lib/rstudio-server \
 -B $SCRATCH/rstudiotmp/var/run:/var/run/rstudio-server \
 -B $SCRATCH/rstudiotmp/tmp:/tmp \
 $H2_CONTAINER_LOC/h2-rstudio_4.1.0.sif
  • apptainer run
    • Starts the RStudio container
  • -B $SCRATCH/rstudiotmp/[dir]:[/dir]
    • Mounts tmp directories to the container
  • $H2_CONTAINER_LOC/h2-rstudio_4.1.0.sif
    • Location of RStudio container
    • Can be changed to a different RStudio version
  • Information will display about RStudio session
    • Note the compute node name and port number.
    • Displays the ssh -N -L ... command to run
    • You will see an RStudio password
      • Needed to open RStudio
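
Before starting the container, it can help to confirm that the image file and the three bind directories exist, since a missing path is an easy mistake to make. The check below is a sketch only; check_rstudio_prereqs is a hypothetical function, and the paths in the example call are the ones used on this slide.

```shell
# Hypothetical pre-flight check: returns 0 only if the image and bind sources exist.
# Arguments: path to the .sif image, path to the rstudiotmp directory.
check_rstudio_prereqs() {
    local sif="$1" base="$2"
    local status=0 d
    if [ ! -f "$sif" ]; then
        echo "missing container image: $sif" >&2
        status=1
    fi
    for d in var/lib var/run tmp; do
        if [ ! -d "${base}/${d}" ]; then
            echo "missing bind directory: ${base}/${d}" >&2
            status=1
        fi
    done
    return "$status"
}

# Example call on Hoffman2 (commented out; run it after "module load apptainer"):
# check_rstudio_prereqs "$H2_CONTAINER_LOC/h2-rstudio_4.1.0.sif" "$SCRATCH/rstudiotmp"
```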

Note

KEEP THIS TERMINAL OPEN UNTIL YOUR JOB IS DONE

Running RStudio (5)

  • Open another terminal on your local computer

  • Run the port forward command

    • Creates a connection from local computer to compute node
ssh -N -L 8787:nXXX:8787 username@hoffman2.idre.ucla.edu
  • Change port 8787 if needed
  • nXXX is the compute node name
  • username is your Hoffman2 username
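
In ssh -L, the first number is the port on your local computer and the last is the port on the compute node, so the two do not have to match. The sketch below prints the command for a given node, username, and ports; tunnel_cmd is a hypothetical helper, and nXXX and username are the same placeholders as above.

```shell
# Hypothetical helper: prints the port-forward command; it does not open a connection.
# Arguments: compute node, Hoffman2 username, optional remote port, optional local port.
tunnel_cmd() {
    local node="$1" user="$2" remote_port="${3:-8787}"
    local local_port="${4:-$remote_port}"
    printf 'ssh -N -L %s:%s:%s %s@hoffman2.idre.ucla.edu\n' \
        "$local_port" "$node" "$remote_port" "$user"
}

# Example: default port 8787 on both ends.
tunnel_cmd nXXX username
```

If local port 8787 is already in use, passing a different fourth argument (for example 8788) keeps the remote port unchanged, and the browser URL then uses the local port you chose.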

Running RStudio (6)

  • Finally, open a web browser
    • Type URL of RStudio Server
    • Will ALWAYS be localhost
    • Change port 8787 if needed
http://localhost:8787

Running RStudio - The Easy Way

  • h2_rstudio.sh
    • Script that runs everything from the previous slides
    • Starts RStudio and opens a web browser for you
    • Runs on your local computer (not Hoffman2)

h2_rstudio.sh Information

Look at our GitHub page

  • Download script
wget https://raw.githubusercontent.com/ucla-oarc-hpc/H2-RStudio/main/h2_rstudio.sh
chmod +x h2_rstudio.sh
  • To display how to use this script
./h2_rstudio.sh -h
  • Run script
    • Replace username with Hoffman2 username
./h2_rstudio.sh -u username

Tested Platforms

macOS Terminal app

Windows WSL2

MobaXterm

Git Bash

RStudio Script

This RStudio Script is currently on our GitHub page

Info on this RStudio Container (1)

  • The RStudio container was built using Docker
    • Based on RStudio images from the Rocker Project
    • Hoffman2 containers located at $H2_CONTAINER_LOC
    • RStudio containers are named:
      • h2-rstudio_X.Y.Z.sif
      • Where X.Y.Z is the R version
  • View all available RStudio containers by running
module load apptainer
ls $H2_CONTAINER_LOC/h2-rstudio*sif
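
Because the R version is encoded in the file name, the available versions can be pulled out with plain parameter expansion. This is a sketch under the naming convention stated above (h2-rstudio_X.Y.Z.sif); list_r_versions is a hypothetical helper.

```shell
# Hypothetical helper: list R versions found in a directory of h2-rstudio_X.Y.Z.sif files.
list_r_versions() {
    local dir="$1" f name
    for f in "$dir"/h2-rstudio_*.sif; do
        [ -e "$f" ] || continue          # skip the unexpanded pattern when nothing matches
        name="${f##*/}"                  # drop the directory part
        name="${name#h2-rstudio_}"       # drop the prefix
        printf '%s\n' "${name%.sif}"     # drop the suffix, leaving X.Y.Z
    done
}

# Example call on Hoffman2 (commented out; run it after "module load apptainer"):
# list_r_versions "$H2_CONTAINER_LOC"
```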

Info on this RStudio Container (2)

  • Separate build of R and
  • R packages installed in unique directory
    • ~/R/APPTAINER/h2-rstudio_4.1.0 (for h2-rstudio_4.1.0.sif)
  • HPC Container files
    • Docker and definition files for Hoffman2 containers
    • RStudio Dockerfiles have all you need to build RStudio
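
To find where packages for a given container end up, the library directory can be derived from the image name. The sketch below assumes the pattern shown in the single example on this slide (the image name without .sif, under ~/R/APPTAINER); r_lib_dir is a hypothetical helper, and other containers may not follow the same pattern.

```shell
# Hypothetical helper: derive the package library directory from a container file name,
# following the one example on this slide (~/R/APPTAINER/<image name without .sif>).
r_lib_dir() {
    local sif="$1" name
    name="${sif##*/}"                                   # drop the directory part
    printf '%s/R/APPTAINER/%s\n' "$HOME" "${name%.sif}" # drop the .sif suffix
}

# Example: prints the library directory for the 4.1.0 container.
r_lib_dir /path/to/h2-rstudio_4.1.0.sif
```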

R Package Installs

  • Some R packages require extra libraries or software in the container
  • Contact us to update this container
    • OR you can modify the Dockerfile for your own container

Tips for Running RStudio (1)

  • If RStudio does not start up
    • Possibly because a previous RStudio session did not shut down correctly
  • Clear out the temp directories
rm -rf $SCRATCH/rstudiotmp
  • Clear out RStudio config files
rm -rf ~/.config/rstudio
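
Since rm -rf is unforgiving, the two cleanup commands can be wrapped with a small guard against empty paths. This is a sketch only; clean_rstudio_state is a hypothetical helper, and the example call uses the same locations as the commands above.

```shell
# Hypothetical helper: remove stale RStudio state, refusing to run on empty or root paths.
# Arguments: temp directory to remove, config directory to remove.
clean_rstudio_state() {
    local tmp_base="$1" config_dir="$2" d
    for d in "$tmp_base" "$config_dir"; do
        if [ -z "$d" ] || [ "$d" = "/" ]; then
            echo "refusing to remove unsafe path: '$d'" >&2
            return 1
        fi
    done
    rm -rf -- "$tmp_base" "$config_dir"
}

# Example call on Hoffman2 (commented out because it deletes files).
# ${SCRATCH:?} aborts with an error if SCRATCH is unset or empty.
# clean_rstudio_state "${SCRATCH:?}/rstudiotmp" "$HOME/.config/rstudio"
```

Remember to recreate the temp directories before starting RStudio again.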

Tips for Running RStudio (2)

  • Access to a Hoffman2 terminal in RStudio

Using Batch R

  • Instead of interactive RStudio, you can run R as a non-interactive batch job
    • Use R from inside RStudio container as a qsub job
  • Create a job script
    • Load Apptainer
    • Use the RStudio container with a .R script
      • apptainer run
#!/bin/bash
#$ -cwd
#$ -o rstudio_batch.out.$JOB_ID
#$ -j y
#$ -l h_rt=3:00:00,h_data=10G
#$ -pe shared 1

# Load Apptainer module
. /u/local/Modules/default/init/modules.sh
module load apptainer

# Run R with an R script named myRtest.R
apptainer run $H2_CONTAINER_LOC/h2-rstudio_4.1.0.sif R CMD BATCH myRtest.R
  • Then run this job script
qsub rstudio_batch.job
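
If you run several R scripts this way, the job script can be generated instead of edited by hand. The sketch below writes the same job script as above with the R script name and core count filled in; write_r_job is a hypothetical helper, and the run time, memory, and container version are copied from the example rather than chosen for any particular workload.

```shell
# Hypothetical helper: write a job script for a given R script.
# Arguments: R script name, output job file, optional number of cores (default 1).
write_r_job() {
    local rscript="$1" jobfile="$2" cores="${3:-1}"
    # \$ keeps $JOB_ID and $H2_CONTAINER_LOC literal so they expand when the job runs.
    cat > "$jobfile" <<EOF
#!/bin/bash
#\$ -cwd
#\$ -o rstudio_batch.out.\$JOB_ID
#\$ -j y
#\$ -l h_rt=3:00:00,h_data=10G
#\$ -pe shared ${cores}

# Load Apptainer module
. /u/local/Modules/default/init/modules.sh
module load apptainer

apptainer run \$H2_CONTAINER_LOC/h2-rstudio_4.1.0.sif R CMD BATCH ${rscript}
EOF
}

# Example call (commented out so the sketch does not write files on its own):
# write_r_job myRtest.R rstudio_batch.job 4
```

The generated file is submitted with qsub in the same way as the hand-written one.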

Summary

  • Utilize RStudio Server on Hoffman2
    • Access through your web browser
    • Applicable to other HPC resources as well
  • RStudio can be used interactively or as a non-interactive batch job
  • Use the h2_rstudio.sh script for easy setup

🙏 Thanks and Happy Computing!

Questions? Comments?